Chimeric forecasting: combining probabilistic predictions from computational models and human judgment

Forecasts of the trajectory of an infectious agent can help guide public health decision making. A traditional approach to forecasting fits a computational model to structured data and generates a predictive distribution. However, human judgment has access to the same data as computational models plus experience, intuition, and subjective data. We propose a chimeric ensemble, a combination of computational and human judgment forecasts, as a novel approach to predicting the trajectory of an infectious agent. Each month from January 2021 to June 2021 we asked two generalist crowds, using the same criteria as the COVID-19 Forecast Hub, to submit a predictive distribution over incident cases and deaths at the US national level either two or three weeks into the future. We combined these human judgment forecasts with forecasts from computational models submitted to the COVID-19 Forecast Hub into a chimeric ensemble. We find that a chimeric ensemble, compared to an ensemble including only computational models, improves predictions of incident cases and shows similar performance for predictions of incident deaths. A chimeric ensemble is a flexible, supportive public health tool and shows promising results for predictions of the spread of an infectious agent. Supplementary Information: The online version contains supplementary material available at 10.1186/s12879-022-07794-5.

-Resolution Criteria: This question will resolve as the total number of adult plus pediatric previous day admissions with confirmed COVID-19 as recorded in the Department of Health and Human Services' report of COVID-19 reported patient impact and hospital capacity. The total previous day admissions is computed using two variables in this report: previous day admission adult covid confirmed and previous day admission pediatric covid confirmed. This report, and the resolution criteria, includes data on all 50 US states, Washington DC, Puerto Rico, and the US Virgin Islands (53 states and territories). The report will be accessed one week after the end of the month (2021-02-06).
-Question: What factor should the median 4-week-ahead COVIDhub Ensemble forecast of national incident deaths made on 4 Jan (a forecast for the 24-30 Jan week) be multiplied by so that it equals the reported number of new US incident deaths?
-Resolution Criteria: This question will resolve as the factor that the COVID-19 Forecast Hub's "COVIDhub" Ensemble median forecast of the US national number of incident deaths 4 weeks into the future should be multiplied by (the reported number of incident deaths divided by the forecasted median) to equal the reported number of US national incident deaths as reported by the Johns Hopkins University (JHU) CSSE Github data repository.
-Range: [0-3]
-Question URL: https://pandemic.metaculus.com/questions/6163/factor-covidhub-forecast-to-be-multiplied-by/
• Question 5
-Question: What will be the cumulative number of deaths due to COVID-19 on 2021-12-31 if less than 50% of Americans initiate vaccination (1st dose received) with a COVID-19 vaccine by 2021-03-01?
-Resolution Criteria: The percent of the population that received a COVID-19 vaccine on or before 2021-03-01 will be computed by dividing the number of individuals who have initiated vaccination (1st dose taken) provided by the CDC COVID data tracker by the current US population, which on 2021-01-04 was reported to be 330,782,991, and multiplying this fraction by 100. The CDC COVID data tracker that counts the number of individuals who have initiated vaccination will be accessed when data is available after and as close as possible to 2021-03-01.
To resolve deaths, we will use the cumulative number of deaths due to confirmed COVID-19 as recorded in the Johns Hopkins University (JHU) CSSE Github data repository. This file records the daily number of deaths by county. From this file deaths are summed across all counties and aggregated to week to generate the number of new deaths per week. The report will be accessed one week after 2021-12-31. 9 January edit: This question will resolve ambiguously if greater than or equal to 50% of Americans are vaccinated by 2021-03-01.
-Resolution Criteria: The percent of the population that received a COVID-19 vaccine on or before 2021-03-01 will be computed by dividing the number of individuals who have initiated vaccination (1st dose taken) provided by the CDC COVID data tracker by the current US population, which on 2021-01-04 was reported to be 330,782,991, and multiplying this fraction by 100. The CDC COVID data tracker that counts the number of individuals who have initiated vaccination will be accessed when data is available after and as close as possible to 2021-03-01.
To resolve deaths, we will use the cumulative number of deaths due to confirmed COVID-19 as recorded in the Johns Hopkins University (JHU) CSSE Github data repository. This file records the daily number of deaths by county. From this file deaths are summed across all counties and aggregated to week to generate the number of new deaths per week. The report will be accessed one week after 2021-12-31. 9 January edit: This question will resolve ambiguously if less than 50% of Americans are vaccinated by 2021-03-01.  The dashboard is updated daily by 8pm ET and will be accessed on 2021-02-28 at approximately 10:00pm ET.
-Resolution Criteria: This question will resolve as the number of variants of concern listed on the "US COVID-19 Cases Caused by Variants" page as of Sunday, 2021-03-07. For example, as of 2021-02-02 this page shows that there are three variants: B.1.1.7, B.1.351, and P.1. This page is updated on Sundays, Tuesdays, and Thursdays by 7pm ET and will be accessed at approximately 10pm ET on 2021-03-07 (a Sunday).
-Resolution Criteria: This question will resolve as the percentage of US S:N501 sequences among all positive SARS-CoV-2 samples submitted for genomic sequencing to the GISAID database between 2021-03-29 and 2021-04-04 (inclusive), as displayed on the "Distribution of S:N501 per country" plot on the following website: https://covariants.org/variants/S.N501. This website pulls data from GISAID and makes it publicly accessible. This percentage will be accessed no sooner than 2021-04-12.
The total previous day admissions is computed using two variables in this report: previous day admission adult covid confirmed and previous day admission pediatric covid confirmed and stored in Lehigh University's Computational Uncertainty Lab Github data repository. This report, and the resolution criteria, includes data on all 50 US states, Washington DC, Puerto Rico, and the US Virgin Islands (53 states and territories). The report will be accessed no sooner than 2021-04-04.
-Resolution Criteria: This question will resolve as the minimum CDC recommended percent of confirmed positive COVID-19 cases that should be sequenced, assuming community transmission. If the CDC does not release such guidance before the end of 2021, then the most-cited paper that provides a recommendation on the minimum recommended percent of positive COVID-19 cases that should be sequenced in the context of community transmission will be consulted on 1 January 2022.
-Range: [0-100]
-Metaculus question URL: https://pandemic.metaculus.com/questions/6718/-covid-cases-that-should-be-sequenced/
• Question 6
-Question: How many variants of concern will be monitored by the US CDC as of 4 April?
-Resolution Criteria: This question will resolve as the number of variants of concern monitored by the CDC as of Sunday, 2021-04-04. For example, as of 2021-03-02 this page shows that there are three variants: B.1.1.7, B.1.351, and P.1. This page is updated on Sundays, Tuesdays, and Thursdays by 7pm ET and will be accessed at approximately 10pm ET on 2021-04-04 (a Sunday).
-Resolution Criteria: This question will resolve as the cumulative number of people who receive one or more doses of a COVID-19 vaccine on 2021-03-31 as recorded by the Centers for Disease Control COVID-19 Data tracker. The radio buttons "People Receiving 1 or More Doses" and "Cumulative" will be selected and the bar corresponding to 2021-03-31 will be accessed. Data is updated daily by 8pm ET and will be accessed no sooner than 2021-04-04. If the CDC changes how it reports vaccination data, we will provide clarifying language as necessary. For purposes of this question, a person receiving a single-dose vaccine would count as a person having received one or more doses of a COVID-19 vaccine.
-Resolution Criteria: This question will resolve as the cumulative number of people who receive 2 doses of a COVID-19 vaccine on 2021-03-31 as recorded by the Centers for Disease Control COVID-19 Data tracker. The radio buttons "People Receiving 2 Doses" and "Cumulative" will be selected and the bar corresponding to 2021-03-31 will be accessed. Data is updated daily by 8pm ET and will be accessed no sooner than 2021-04-04. If the CDC changes how it reports vaccination data, we will provide clarifying language as necessary. For purposes of this question, a person receiving a single-dose vaccine would count as a person having received one or more doses of a COVID-19 vaccine.
Mar 8 edit: On 2021-03-08, the CDC's vaccine tracker at https://covid.cdc.gov/covid-data-tracker/#vaccinations changed the "receiving 2 doses" figure to "fully vaccinated" to account for people who receive one dose of the Johnson & Johnson vaccine, which has been authorized as a single-dose regimen (by contrast, Pfizer/BioNTech and Moderna are authorized as two-dose vaccines). This question will resolve on the basis of the new "fully vaccinated" figure reported by the CDC.
The total previous day admissions is computed using two variables in this report: previous day admission adult covid confirmed and previous day admission pediatric covid confirmed and stored in Lehigh University's Computational Uncertainty Lab Github data repository. This report, and the resolution criteria, includes data on all 50 US states, Washington DC, Puerto Rico, and the US Virgin Islands (53 states and territories). The report will be accessed no sooner than 2021-09-05.
-Question: What will be the cumulative number of deaths in the US due to COVID-19 on 2021-12-31?
-Resolution Criteria: This question will resolve as the number of cumulative deaths due to confirmed COVID-19 on 2021-12-31 as recorded in the Johns Hopkins University (JHU) CSSE Github data repository. This file records the daily number of deaths by county. The number of cumulative deaths at the end of the year will be computed by adding the cumulative number of deaths across states. This data, and the resolution criteria, includes data on all 50 US states, Washington DC, Puerto Rico, and the US Virgin Islands (53 states and territories). The report will be accessed no sooner than 9 January 2022.
-Question: What will be the cumulative number of deaths in the US due to COVID-19 on 2021-12-31?
-Resolution Criteria: This question will resolve as the number of cumulative deaths due to confirmed COVID-19 on 2021-12-31 as recorded in the Johns Hopkins University (JHU) CSSE Github data repository. This file records the daily number of deaths by county. The number of cumulative deaths at the end of the year will be computed by adding the cumulative number of deaths across states. This data, and the resolution criteria, includes data on all 50 US states, Washington DC, Puerto Rico, and the US Virgin Islands (53 states and territories). The report will be accessed no sooner than 2022-01-09.
-Question: What will be the % prevalence of SARS-CoV-2 variants thought to partially escape immunity for the two-week period 23 May - 05 Jun 2021?
-Resolution Criteria: This question will resolve on the basis of the first update that shows figures for the two-week period ending 05 Jun of the "Weighted Estimates of Proportions of SARS-CoV-2 Lineages" table on the U.S. CDC's "Variant Proportions" page. The percentages of variants that cause "reduced neutralization by convalescent and post-vaccination sera" will be added up.
If between now and 05 Jun there are additional variants classified by the CDC as variants that cause "reduced neutralization by convalescent and post-vaccination sera," these will count toward the total percent figure. Likewise, if any of the variants that are currently classified as causing partial immune escape are removed from being classified as such, they will no longer count toward the total percent figure.
-Question: What will be the cumulative number of deaths in the US due to COVID-19 on 2021-12-31?
-Resolution Criteria: This question will resolve as the number of cumulative deaths due to confirmed COVID-19 on 2021-12-31 as recorded in the Johns Hopkins University (JHU) CSSE Github data repository. This file records the daily number of deaths by county. The number of cumulative deaths at the end of the year will be computed by adding the cumulative number of deaths across states. This data, and the resolution criteria, includes data on all 50 US states, Washington DC, Puerto Rico, and the US Virgin Islands (53 states and territories). The report will be accessed no sooner than 9 January 2022.
-Question: What will be the prevalence of SARS-CoV-2 variants thought to partially escape immunity for the two-week period 20 June - 03 July 2021?
-Resolution Criteria: This question will resolve on the basis of the first update that shows figures for the two-week period ending 03 July of the "Weighted Estimates of Proportions of SARS-CoV-2 Lineages" table on the U.S. CDC's "Variant Proportions" page. The percentages of variants that cause "reduced neutralization by convalescent and post-vaccination sera" will be added up. If between now and 03 July there are additional variants classified by the CDC as variants that cause "reduced neutralization" by convalescent and/or post-vaccination sera, these will count toward the total percent figure. Likewise, if any of the variants that are currently classified as causing partial immune escape are removed from being classified as such, they will no longer count toward the total percent figure.
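Several of the resolution criteria above reduce to simple arithmetic: the multiplicative factor (reported incident deaths divided by the forecasted median), the percent of the population that has initiated vaccination, and weekly new deaths obtained by summing county-level series and aggregating to weeks. The following is a minimal Python sketch of those three calculations. The function names, the input layout (a mapping from county to a daily cumulative series whose first entry is the day before the first week), and the example numbers are assumptions made for illustration and do not come from the supplementary material.

```python
from typing import List, Mapping, Sequence

# Population figure quoted in the resolution criteria (reported on 2021-01-04).
US_POPULATION_2021_01_04 = 330_782_991


def resolution_factor(reported_deaths: float, forecast_median: float) -> float:
    """Factor the forecasted median must be multiplied by to equal the reported value."""
    if forecast_median <= 0:
        raise ValueError("forecast median must be positive")
    return reported_deaths / forecast_median


def percent_initiated(first_doses: int, population: int = US_POPULATION_2021_01_04) -> float:
    """Percent of the population that has received a first dose."""
    return 100.0 * first_doses / population


def weekly_new_deaths(cumulative_by_county: Mapping[str, Sequence[int]]) -> List[int]:
    """Sum daily cumulative deaths across counties, then difference every 7 days.

    Assumes (hypothetically) that every series has the same length and that
    index 0 is the day before the first week begins.
    """
    national = [sum(day) for day in zip(*cumulative_by_county.values())]
    week_ends = national[::7]
    return [later - earlier for earlier, later in zip(week_ends, week_ends[1:])]


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(resolution_factor(reported_deaths=3000, forecast_median=2000))
    print(percent_initiated(first_doses=50, population=200))
    print(weekly_new_deaths({"A": list(range(15)), "B": [10] * 15}))
```

A factor above 1 means the ensemble median under-predicted the reported value, and a factor below 1 means it over-predicted.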

B. FORECASTING PLATFORMS
FIG. S1: The Metaculus format for submitting a forecast over a target of interest. Forecasters are presented with background information about the target and data sources that may be helpful when building a forecast. A forecaster can also view a consensus forecast from when the question was posed until the present. A clearly defined question is asked in bold font, and forecasters are also presented with the resolution criteria, which describe how the ground truth for this question will be determined. Forecasters build a predictive distribution as a mixture of five logistic distributions. Forecasters, if they wish, can also share useful comments and questions with others in a chat box underneath their forecast.
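To make "a mixture of five logistic distributions" concrete, the sketch below evaluates the cumulative distribution function of a weighted logistic mixture and inverts it by bisection to read off quantiles. It is an illustration under stated assumptions, not the platform's implementation: the (weight, location, scale) parameterization, the function names, and the example component values are all hypothetical.

```python
import math
from typing import Sequence, Tuple

# Hypothetical parameterization: (weight, location, scale) for each component.
Component = Tuple[float, float, float]


def logistic_cdf(x: float, loc: float, scale: float) -> float:
    """CDF of a logistic distribution, written to avoid overflow in exp."""
    z = (x - loc) / scale
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mixture_cdf(x: float, components: Sequence[Component]) -> float:
    """Weighted average of component CDFs; weights are normalized to sum to 1."""
    total_weight = sum(w for w, _, _ in components)
    return sum(w * logistic_cdf(x, loc, s) for w, loc, s in components) / total_weight


def mixture_quantile(p: float, components: Sequence[Component],
                     lo: float, hi: float, tol: float = 1e-9) -> float:
    """Invert the mixture CDF on [lo, hi] by bisection."""
    if not mixture_cdf(lo, components) <= p <= mixture_cdf(hi, components):
        raise ValueError("quantile is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, components) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Five hypothetical components for a weekly incident death count.
    forecast = [(0.3, 14000, 800), (0.3, 15000, 600), (0.2, 16000, 900),
                (0.1, 18000, 1500), (0.1, 12000, 1000)]
    for p in (0.025, 0.5, 0.975):
        print(p, round(mixture_quantile(p, forecast, lo=0, hi=50000)))
```

Reading quantiles off the mixture in this way is one route to expressing a human judgment forecast in the same quantile form as a computational model forecast.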
FIG. S2: The Good Judgment Open format for submitting a forecast over a target of interest. The question is posed to forecasters at the top of the page in bold font, and underneath the question are the resolution criteria describing how the ground truth for this question will be determined. Forecasters can also view the current and past consensus distribution for this question. Forecasters build their predictive distribution by sliding N bars which represent intervals over possible truth values. Forecasters, if they wish, can also share useful comments and questions with others.
FIG. S3: Paired difference in WIS score for predictions of incident cases (A.) and deaths (B.) for surveys 2, 3, 4, 5, 6 between a performance based and equally weighted computational ensemble (blue), human judgement (red), and chimeric ensemble (gold) using a "spotty memory" imputation approach.
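The WIS is not defined in this section. As a hedged illustration, the sketch below uses the commonly published form of the weighted interval score for quantile forecasts (an absolute-error term on the median plus alpha-weighted interval scores, normalized by K + 1/2) and then takes per-survey paired differences between two ensembles. The function names and all numbers are hypothetical, and the paper's exact scoring and imputation choices may differ.

```python
from typing import List, Mapping, Sequence, Tuple


def interval_score(lower: float, upper: float, y: float, alpha: float) -> float:
    """Interval score of a central (1 - alpha) prediction interval."""
    score = upper - lower
    if y < lower:
        score += (2.0 / alpha) * (lower - y)
    elif y > upper:
        score += (2.0 / alpha) * (y - upper)
    return score


def weighted_interval_score(median: float,
                            intervals: Mapping[float, Tuple[float, float]],
                            y: float) -> float:
    """WIS with weights w_0 = 1/2 on the median and w_k = alpha_k / 2 on intervals.

    `intervals` maps alpha to the (lower, upper) bounds of the central
    (1 - alpha) interval. Lower scores are better.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


def paired_differences(scores_a: Sequence[float], scores_b: Sequence[float]) -> List[float]:
    """Per-survey difference a - b; negative values favor ensemble a."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score sequences must have the same length")
    return [a - b for a, b in zip(scores_a, scores_b)]


if __name__ == "__main__":
    # Hypothetical forecast: median 10 with a central 50% interval of (8, 12).
    print(weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, y=14.0))
    # Hypothetical per-survey WIS values for two ensembles.
    print(paired_differences([3.0, 2.5, 4.0], [3.5, 2.5, 3.0]))
```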